Word Sense Disambiguation Using Inductive Logic Programming

نویسندگان

  • Lucia Specia
  • Ashwin Srinivasan
  • Ganesh Ramakrishnan
  • Maria das Graças Volpe Nunes
چکیده

The identification of the correct sense of a word is necessary for many tasks in automatic natural language processing like machine translation, information retrieval, speech and text processing. Automatic Word Sense Disambiguation (WSD) is difficult and accuracies with state-of-the art methods are substantially lower than in other areas of text understanding like part-ofspeech tagging. One shortcoming of these methods is that they do not utilize substantial sources of background knowledge, such as semantic taxonomies and dictionaries, which are now available in electronic form (the methods largely use shallow syntactic features). Empirical results from the use of Inductive Logic Programming (ILP) have repeatedly shown the ability of ILP systems to use diverse sources of background knowledge. In this paper we investigate the use of ILP for WSD in two different ways: (a) as a stand-alone constructor of models for WSD; and (b) to build interesting features, which can then be used by standard model-builders such as SVM. Our investigation is in the form of experiments that examine a monolingual WSD task using the 32 English verbs contained in the SENSEVAL-3 benchmark data; and a bilingual WSD task using 7 highly ambiguous verbs in machine translation from English to Portuguese. Background knowledge available is from eight sources that provide a wide range of syntactic and semantic information. For both WSD tasks, experimental results show that ILP-constructed models and models built using ILP-generated features have higher accuracies than those obtained using a state-of-the art feature-based technique equipped with shallow syntactic features. This suggests that the use of ILP with diverse sources of background knowledge can provide one way for making substantial progress in the field of

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Hybrid Relational Approach for Word Sense Disambiguation

We propose a novel approach for word sense disambiguation which makes use of corpus-based evidence combined with background knowledge. Using an inductive logic programming technique, it generates expressive models which exploit several knowledge sources and also the relations between them. The approach is evaluated in two tasks: identification of the correct translation for verbs in English-Por...

متن کامل

Learning Expressive Models for Word Sense Disambiguation

We present a novel approach to the word sense disambiguation problem which makes use of corpus-based evidence combined with background knowledge. Employing an inductive logic programming algorithm, the approach generates expressive disambiguation rules which exploit several knowledge sources and can also model relations between them. The approach is evaluated in two tasks: identification of the...

متن کامل

Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation

Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to le...

متن کامل

A Hybrid Relational Approach for WSD - First Results

We present a novel hybrid approach for Word Sense Disambiguation (WSD) which makes use of a relational formalism to represent instances and background knowledge. It is built using Inductive Logic Programming techniques to combine evidence coming from both sources during the learning process, producing a rule-based WSD model. We experimented with this approach to disambiguate 7 highly ambiguous ...

متن کامل

USP-IBM-1 and USP-IBM-2: The ILP-based Systems for Lexical Sample WSD in SemEval-2007

We describe two systems participating of the English Lexical Sample task in SemEval2007. The systems make use of Inductive Logic Programming for supervised learning in two different ways: (a) to build Word Sense Disambiguation (WSD) models from a rich set of background knowledge sources; and (b) to build interesting features from the same knowledge sources, which are then used by a standard mod...

متن کامل

Learning part of speech disambiguation rules using Inductive Logic Programming

A pilot study on inducing rules for part of speech tagging of unrestricted Swedish text is reported. Using the Progol machine-learning system, Constraint Grammar inspired rules were learnt from the part of speech tagged Stockholm-Ume a Corpus. Several thousand disambiguation rules discarding faulty readings of ambiguously tagged words were induced. When tested on unseen data, 97% of the words r...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006